BPEL4Job: A Fault-Handling Design for Job Flow Management

Authors

  • Wei Tan
  • Liana L. Fong
  • Norman Bobroff
Abstract

Workflow technology is an emerging paradigm for systematic modeling and orchestration of job flow for enterprise and scientific applications. This paper introduces BPEL4Job, a BPEL-based design for fault handling of job flow in a distributed computing environment. The features of the proposed design include: a two-stage approach for job flow modeling that separates base flow structure from fault-handling policy, a generic job proxy that isolates the interaction complexity between the flow engine and the job scheduler, and a method for migrating flow instances between different flow engines for fault handling in a distributed system. An implementation of the design based on a set of industrial products from IBM is presented and validated using a Montage application.

Similar resources

Design of a Fault-tolerant Job-flow Manager for Grid Environments Using Standard Technologies, Job-flow Patterns, and a Transparent Proxy

The execution of job flow applications is a reality today in academic and industrial domains. Current approaches to execution of job flows often follow proprietary solutions on expressing the job flows and do not leverage recurrent job-flow patterns to address faults in Grid computing environments. In this paper, we provide a design solution to development of job-flow managers that uses standar...

Design and Implementation of a Fault Tolerant Job Flow Manager Using Job Flow Patterns and Recovery Policies

Currently, many grid applications are developed as job flows that are composed of multiple jobs. The execution of job flows requires the support of a job flow manager and a job scheduler. Due to the long running nature of job flows, the support for fault tolerance and recovery policies is especially important. This support is inherently complicated due to the sequencing and dependency of jobs w...

A Survey on Fault Tolerance in Work flow Management and Scheduling

Fault tolerance is a configuration that prevents a computer or network device from failing in the event of an unexpected problem or error, such as hardware failure, link failure, unauthorized access, variations in the configuration of different systems, or a system running out of memory or disk space. The integration of fault tolerance measures with scheduling gains much importance. Workflow managemen...

Fuxi: a Fault-Tolerant Resource Management and Job Scheduling System at Internet Scale

Scalability and fault-tolerance are two fundamental challenges for all distributed computing at Internet scale. Despite many recent advances from both academia and industry, these two problems are still far from settled. In this paper, we present Fuxi, a resource management and job scheduling system that is capable of handling the kind of workload at Alibaba where hundreds of terabytes of data ...

Improving for Drum_Buffer_Rope material flow management with attention to second bottlenecks and free goods in a job shop environment

Drum–Buffer–Rope is a theory of constraints production planning methodology that operates by developing a schedule for the system’s first bottleneck. The first bottleneck is the bottleneck with the highest utilization. In the theory of constraints, any job that is not processed at the first bottleneck is referred to as a free good. Free goods do not use capacity at the first bottleneck, so very...

Publication date: 2007